Mining Relations Among Cross-Frame Affinities for Video Semantic Segmentation

Abstract

The essence of video semantic segmentation (VSS) is how to leverage temporal information for prediction. Previous efforts are mainly devoted to developing new techniques to calculate the cross-frame affinities, such as optical flow and attention. Instead, this paper contributes from a different angle by mining relations among cross-frame affinities, upon which better temporal information aggregation could be achieved. We explore relations among affinities in two aspects: single-scale intrinsic correlations and multi-scale relations. Inspired by traditional feature processing, we propose Single-scale Affinity Refinement (SAR) and Multi-scale Affinity Aggregation (MAA). To make it feasible to execute MAA, we propose a Selective Token Masking (STM) strategy to select a subset of consistent reference tokens for different scales when calculating affinities, which also improves the efficiency of our method. At last, the cross-frame affinities strengthened by SAR and MAA are adopted for adaptively aggregating temporal information. Our experiments demonstrate that the proposed method performs favorably against state-of-the-art VSS methods. The code is publicly available at https://github.com/GuoleiSun/VSS-MRCFA .
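To make the term "cross-frame affinity" concrete, here is a minimal sketch of one common attention-style formulation: each token of the current frame is compared with every token of a reference frame, the scores are normalized with a softmax, and the resulting weights are used to aggregate reference-frame features. This is an illustrative assumption, not the authors' implementation; the function names, the toy token values, and the use of scaled dot products are all hypothetical, and SAR, MAA, and STM are not shown.

```python
import math

def cross_frame_affinity(queries, keys):
    """Attention-style affinity (assumed formulation): row i is a softmax over
    scaled dot products between target-frame token i and every reference token."""
    dim = len(queries[0])
    affinity = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in keys]
        top = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        affinity.append([e / total for e in exps])
    return affinity

def aggregate(affinity, values):
    """Weighted sum of reference-frame values, one output vector per target token."""
    width = len(values[0])
    return [
        [sum(w * v[d] for w, v in zip(row, values)) for d in range(width)]
        for row in affinity
    ]

# Toy data (hypothetical): 2 target-frame tokens, 3 reference-frame tokens, dim 2.
target_tokens = [[1.0, 0.0], [0.0, 1.0]]
reference_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

affinity = cross_frame_affinity(target_tokens, reference_tokens)
fused = aggregate(affinity, reference_tokens)
```

Each row of `affinity` sums to 1, so `fused` is a convex combination of reference-frame features. The paper's contribution, per the abstract, lies in refining and combining such affinity maps across scales rather than in how a single map is computed.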

Similar resources

Cross-Granularity Graph Inference for Semantic Video Object Segmentation

We address semantic video object segmentation via a novel cross-granularity hierarchical graphical model to integrate tracklet and object proposal reasoning with superpixel labeling. Tracklet characterizes varying spatial-temporal relations of video object which, however, quite often suffers from sporadic local outliers. In order to acquire high-quality tracklets, we propose a transductive infer...

Semantic segmentation and description for video transcoding

We present an automatic content-based video transcoding algorithm which is based on how humans perceive visual information. The transcoder supports multiple video objects and their description. First the video is decomposed into meaningful objects through semantic segmentation. Then the transcoder adapts its behaviour to code relevant (foreground) and non-relevant objects differently. Both objec...

Fast Bilateral Solver for Semantic Video Segmentation

We apply the fast bilateral solver technique to the problem of real-time semantic video segmentation. While structured prediction by a dense CRF is accurate on video datasets, the performance is not adequate for real-time segmentation. We hope to utilize the efficient smoothing methodology from the fast bilateral solver within the video segmentation framework introduced by Kundu et al. [9], imp...

Clockwork Convnets for Video Semantic Segmentation

Recent years have seen tremendous progress in still-image segmentation; however, the naïve application of these state-of-the-art algorithms to every video frame requires considerable computation and ignores the temporal continuity inherent in video. We propose a video recognition framework that relies on two key observations: 1) while pixels may change rapidly from frame to frame, the semantic c...

Mining Semantic Relations between Research Areas

For a number of years now we have seen the emergence of repositories of research data specified using OWL/RDF as representation languages, and conceptualized according to a variety of ontologies. This class of solutions promises both to facilitate the integration of research data with other relevant sources of information and also to support more intelligent forms of querying and exploration. H...

Journal

Journal title: Lecture Notes in Computer Science

Year: 2022

ISSN: 1611-3349, 0302-9743

DOI: https://doi.org/10.1007/978-3-031-19830-4_30